[VL] Make VeloxResizeBatchesExec inherit from ColumnarToColumnarExec to simplify the code#10763
Conversation
|
Run Gluten Clickhouse CI on x86 |
|
Run Gluten Clickhouse CI on x86 |
1 similar comment
|
Run Gluten Clickhouse CI on x86 |
ff91c8b to
cc971af
Compare
|
Run Gluten Clickhouse CI on x86 |
cc971af to
f664592
Compare
|
Run Gluten Clickhouse CI on x86 |
|
Run Gluten Clickhouse CI on x86 |
|
CC @zhztheplayer Thanks |
zhztheplayer
left a comment
There was a problem hiding this comment.
Thanks for the PR.
Would you like to help check the UI sanity, especially the metrics? Could post a screenshot here if possible? Thanks!
|
@zhztheplayer Sure, here is one metric description changed
|
zhztheplayer
left a comment
There was a problem hiding this comment.
LGTM. Thank you for the screenshot.
| def unapply(plan: SparkPlan): Option[SparkPlan] = { | ||
| plan match { | ||
| case c2c: ColumnarToColumnarTransition => | ||
| case c2c: ColumnarToColumnarTransition if !c2c.isSameConvention => |
There was a problem hiding this comment.
Sorry I missed out this change. Would you explain the purpose of it? Thanks.
There was a problem hiding this comment.
ColumnarToColumnarLike is used by the Coster to calculate C2C costs. Previously, VeloxResizeBatchesExec didn't inherit from ColumnarToColumnarTransition, so it wasn't included in the cost calculation. This change is to maintain the previous behavior.
Besides, in my testing, without this, ELECT max(l_orderkey) FROM lineitem will not generate a VeloxResizeBatchesExec node because it is eliminated after the cost calculation.
There was a problem hiding this comment.
Currently VeloxResizeBatchesExec is added by rule https://github.com/apache/incubator-gluten/blob/cd2c0cca9b9478a050bfbb90f15e75f99e7adcd2/backends-velox/src/main/scala/org/apache/gluten/extension/AppendBatchResizeForShuffleInputAndOutput.scala#L31-L59 without costers involved. Am I missing something? How is it eliminated after being added by the rule?
There was a problem hiding this comment.
@zhztheplayer , Not sure if my understanding is correct, in InsertTransitions we removeForNode ColumnarToColumnarLike, and then fillWithTransitions
case class InsertTransitions(convReq: ConventionReq) extends Rule[SparkPlan] {
private val convFunc = ConventionFunc.create()
override def apply(plan: SparkPlan): SparkPlan = {
// Remove all transitions at first.
val removed = RemoveTransitions.apply(plan)
val filled = fillWithTransitions(removed)
val out = Transitions.enforceReq(filled, convReq)
out
}but in VeloxBatchType we have no the Transitions, so after we remove node, we loss VeloxResizeBatchesExec
object VeloxBatchType extends Convention.BatchType {
override protected def registerTransitions(): Unit = {
fromRow(Convention.RowType.VanillaRowType, RowToVeloxColumnarExec.apply)
toRow(Convention.RowType.VanillaRowType, VeloxColumnarToRowExec.apply)
fromBatch(ArrowBatchTypes.ArrowNativeBatchType, ArrowColumnarToVeloxColumnarExec.apply)
toBatch(ArrowBatchTypes.ArrowNativeBatchType, Transition.empty)
}
}There was a problem hiding this comment.
It seems that the current c2c is automatically added, and resize is manually added, so we should not remove it. and whether velox resize should also be included in the cost
|
Merging, thank you @Zouxxyy. I felt there is opportunity to refactor the code to make it less impact the transition planner, but haven't got chance to experiment. I will revisit later. |
…tion It's an improvement following apache#10763. The PR decouples ColumnarToColumnarExec from ColumnarToColumnarTransition to make them both can be solely implemented. If a class inherits from ColumnarToColumnarExec but not from ColumnarToColumnarTransition at the same time, it will not be considered as a transition, so it's not going to be removed by RemoveTransitions rule.
…tion It's an improvement following apache#10763. The PR decouples ColumnarToColumnarExec from ColumnarToColumnarTransition to make them both can be solely implemented. If a class inherits from ColumnarToColumnarExec but not from ColumnarToColumnarTransition at the same time, it will not be considered as a transition, so it's not going to be removed by RemoveTransitions rule.

What changes are proposed in this pull request?
VeloxResizeBatchesExecinherit fromColumnarToColumnarExecto simplify the codeClosableIteratorto be generic.isSameConventioninColumnarToColumnarTransitionHow was this patch tested?